Similarity Language Model

Authors

  • Christian Gillot
  • Christophe Cerisara

Abstract

The similarity language model is a statistical model that makes efficient use of long-distance information when possible and falls back to a standard n-gram language model when not. To estimate the probability distribution for a given target context, each training example of the n-gram model is retrieved and its similarity to the target context is estimated. In this work, this is done by performing a string alignment and training the system to estimate the similarity of each possible alignment. Whereas the n-gram model deems all such examples equal, here the more similar an example is to the current context, the more weight it is given in the estimation of the probability distribution. The proposed model outperforms a modified Kneser-Ney 4-gram model.
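The weighting idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: the paper learns a similarity from string alignments, whereas the sketch substitutes a hypothetical hand-written similarity (an exponential in the number of extra trailing tokens shared with the target context). The function names and the `order` and `base` parameters are assumptions made for illustration only.

```python
from collections import defaultdict


def suffix_match_length(a, b):
    """Count how many trailing tokens two token lists have in common."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n


def similarity_weighted_distribution(examples, target, order=2, base=2.0):
    """Estimate P(next word | target) from similarity-weighted examples.

    examples: list of (history_tokens, next_word) pairs from training data.
    order:    order of the underlying n-gram model (assumed parameter).
    base:     hypothetical similarity base; base=1.0 weighs all retrieved
              examples equally, i.e. plain n-gram relative frequencies.
    """
    k = order - 1  # context length used by the n-gram model
    weights = defaultdict(float)
    for history, next_word in examples:
        shared = suffix_match_length(history, target)
        if shared < k:
            continue  # not a training example of the n-gram for this context
        # Stand-in similarity: more shared long-distance context, more weight.
        weights[next_word] += base ** (shared - k)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {word: w / total for word, w in weights.items()}


examples = [
    (["the", "cat", "sat", "on"], "the"),
    (["a", "dog", "sat", "on"], "a"),
    (["he", "stood", "on"], "the"),
]
target = ["my", "cat", "sat", "on"]
weighted = similarity_weighted_distribution(examples, target)
uniform = similarity_weighted_distribution(examples, target, base=1.0)
```

With `base=1.0` every retrieved example counts equally, which mirrors the fallback to the standard n-gram estimate; with a larger `base`, examples that share more of the longer history dominate the distribution.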

Similar resources

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

A Combined Similarity Measure for Determining Similarity of Model-based and Descriptive Requirements

RSL is a requirements specification language that was developed in the ReDSeeDS project. The language allows requirements specifications using a meta model that defines both model-based and descriptive representations. Model-based representations provide activity and sequence diagrams similar to UML while descriptive representations offer scenarios in constrained language and sentence lists. In ...

Who’s afraid of similarity? Effects of phonological and semantic similarity on lexical acquisition

Children are sensitive to statistical regularities in speech and likely use these regularities when learning their native language. A central goal of current research is to understand which statistical regularities support different aspects of language acquisition and processing. In the current work we explore phonological and semantic similarity effects on early lexical acquisition. Using a co...

A Wikipedia-Based Multilingual Retrieval Model

This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia: given a document d written in language L we construct a concept vector d for d, where each dimension i in d quantifies the similarity of d with respect to a document di chosen from the “L-subset” of Wikipedia. Likew...

Publication date: 2011